Engineering a Simplified 0-Bit Consistent Weighted Sampling

نویسندگان

  • Edward Raff
  • Jared Sylvester
  • Charles Nicholas
چکیده

The Min-Hashing approach to sketching has become an important tool in data analysis, search, and classi cation. To apply it to real-valued datasets, the ICWS algorithm has become a seminal approach that is widely used, and provides state-of-the-art performance for this problem space. However, ICWS su ers a computational burden as the sketch size K increases. We develop a new Simpli ed approach to the ICWS algorithm, that enables us to obtain over 20x speedups compared to the standard algorithm. The veracity of our approach is demonstrated empirically on multiple datasets, showing that our new Simpli ed CWS obtains the same quality of results while being an order of magnitude faster.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Search Based Weighted Multi-Bit Flipping Algorithm for High-Performance Low-Complexity Decoding of LDPC Codes

In this paper, two new hybrid algorithms are proposed for decoding Low Density Parity Check (LDPC) codes. Original version of the proposed algorithms named Search Based Weighted Multi Bit Flipping (SWMBF). The main idea of these algorithms is flipping variable multi bits in each iteration, change in which leads to the syndrome vector with least hamming weight. To achieve this, the proposed algo...

متن کامل

Search Based Weighted Multi-Bit Flipping Algorithm for High-Performance Low-Complexity Decoding of LDPC Codes

In this paper, two new hybrid algorithms are proposed for decoding Low Density Parity Check (LDPC) codes. Original version of the proposed algorithms named Search Based Weighted Multi Bit Flipping (SWMBF). The main idea of these algorithms is flipping variable multi bits in each iteration, change in which leads to the syndrome vector with least hamming weight. To achieve this, the proposed algo...

متن کامل

Impulsive Noise Elimination Considering the Bit Planes Information of the Image

Impulsive noise is one of the imposed defectives degrades the quality of images. Performance of many image processing applications directly depends on the quality of the input image. Hence, it is necessary to de-noise the degraded images without losing their valuable information such as edges. In this paper we propose a method to remove impulsive noise from color images without damaging the ima...

متن کامل

An Efficient Lapped Orthogonal Transform Image Coding Technique

A fast and computationally less complex coding techque is described which uses partial-LOT computation a l g o r i b and efficiently cfiscards perceptually insignificant high frequency transform coefficients. The coding process involves AC energy classification, human visual system weighted normalization and quantization. The values of normalization factors are image independent and governed on...

متن کامل

Uniform generation of RNA pseudoknot structures with genus filtration

In this paper we present a sampling framework for RNA structures of fixed topological genus. We introduce a novel, linear time, uniform sampling algorithm for RNA structures of fixed topological genus g, for arbitrary g > 0. Furthermore we develop a linear time sampling algorithm for RNA structures of fixed topological genus g that are weighted by a simplified, loop-based energy functional. For...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2018